CryoGEM💎: Physics-Informed Generative Cryo-Electron Microscopy

Accepted by NeurIPS 2024
1ShanghaiTech University,
2Cellverse,
3iHuman Institute

* Indicates Equal Contribution, † Indicates the corresponding author

CryoGEM improves cryo-EM data analysis. Cryo-EM images molecules embedded in vitrified ice with an electron beam, and a multi-stage pipeline processes the resulting data into a high-resolution 3D reconstruction. However, some modules, such as (a) particle picking and (d) ab-initio 3D reconstruction, still lack high-quality training datasets. Given a coarse result as input, CryoGEM can synthesize authentic single-particle micrographs to augment the training data.

Abstract

In the past decade, deep conditional generative models have revolutionized the generation of realistic images, extending their application from entertainment to scientific domains. Single-particle cryo-electron microscopy (cryo-EM) is crucial for resolving near-atomic resolution 3D structures of proteins, such as the SARS-CoV-2 spike protein. To achieve high-resolution reconstruction, a comprehensive data processing pipeline has been adopted. However, its performance is still limited because it lacks high-quality annotated datasets for training. To address this, we introduce physics-informed generative cryo-electron microscopy (CryoGEM), which for the first time integrates physics-based cryo-EM simulation with generative unpaired noise translation to produce physically correct synthetic cryo-EM datasets with realistic noise. CryoGEM first simulates the cryo-EM imaging process based on a virtual specimen. To generate realistic noise, we then apply unpaired noise translation via contrastive learning with a novel mask-guided sampling scheme. Extensive experiments show that CryoGEM is capable of generating authentic cryo-EM images. The generated dataset can be used as training data for particle picking and pose estimation models, eventually improving the reconstruction resolution.

Pipeline

overview of pipeline
We begin by creating a virtual specimen containing various initial reconstruction results. We then simulate the imaging process of cryo-EM, incorporating physical priors such as ice gradient and point spread function (PSF) to generate a physical simulation. By adding simple Gaussian noise to the physically simulated results, we introduce randomness within a contrastive learning framework. To enhance training efficiency and performance, we use the particle-background mask as a guide for patch sampling. The sampled positive and negative instances are then encoded into multi-scale features for contrastive learning. Additionally, we introduce an adversarial loss to ensure realistic cryo-EM image synthesis.
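The mask-guided sampling and contrastive objective described above can be sketched as follows. This is a minimal illustration, not CryoGEM's implementation: the function names `mask_guided_sample` and `info_nce`, the choice to draw negatives from the mask class opposite to the query, the temperature value, and the use of a standard InfoNCE loss on single-scale feature vectors are all assumptions made for the example.

```python
import numpy as np


def mask_guided_sample(mask, num_negatives, rng):
    """Pick one query pixel and negatives from the opposite mask class.

    mask: 2D boolean array, True on particles, False on background.
    Returns flat indices (query, negatives) into mask.reshape(-1).
    Hypothetical helper; the sampling rule is an assumption.
    """
    flat = np.asarray(mask, dtype=bool).reshape(-1)
    query = int(rng.integers(flat.size))
    # Candidate negatives are all pixels whose label differs from the query's.
    opposite = np.flatnonzero(flat != flat[query])
    if opposite.size == 0:
        raise ValueError("mask must contain both particle and background pixels")
    negatives = rng.choice(
        opposite, size=num_negatives, replace=opposite.size < num_negatives
    )
    return query, negatives


def info_nce(query_feat, positive_feat, negative_feats, temperature=0.07):
    """Standard InfoNCE loss for one query with one positive and N negatives."""
    q = query_feat / np.linalg.norm(query_feat)
    p = positive_feat / np.linalg.norm(positive_feat)
    n = negative_feats / np.linalg.norm(negative_feats, axis=1, keepdims=True)
    # The positive logit goes first, so the target class index is 0.
    logits = np.concatenate([[q @ p], n @ q]) / temperature
    logits = logits - logits.max()  # numerical stability
    return float(-(logits[0] - np.log(np.exp(logits).sum())))
```

In a setup like the one described, the query feature would come from the generated image and the positive from the same location in the physical simulation, with the features at the sampled negative indices filling `negative_feats`. The loss would then be summed over several encoder scales.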

Conditional Micrograph Generation

(a) Location-controlled generation. (b) Ice-gradient-controlled generation. (c, d) Zero-shot transfer between particle and noise.

Conditional Particle Generation

Beyond micrograph-level controls, CryoGEM supports fine-grained per-particle conditions, including pose, conformation, and defocus value.
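To make the role of the defocus value concrete, here is a minimal sketch of how defocus shapes the contrast transfer function (CTF), the Fourier-space counterpart of the point spread function. It uses a common textbook form of the CTF and is not taken from CryoGEM. The function names, the default microscope parameters (300 kV, 2.7 mm spherical aberration, 0.1 amplitude contrast), and the sign convention are assumptions for illustration; conventions differ between software packages.

```python
import numpy as np


def electron_wavelength_angstrom(voltage_kv):
    """Relativistic electron wavelength in Angstrom for a voltage in kV."""
    v = voltage_kv * 1e3
    return 12.2643 / np.sqrt(v + 0.97845e-6 * v**2)


def ctf_2d(size, pixel_size, defocus, voltage_kv=300.0, cs_mm=2.7, amp_contrast=0.1):
    """Textbook 2D CTF on an FFT-ordered grid (no astigmatism, no envelope).

    pixel_size and defocus are in Angstrom; all defaults are illustrative.
    """
    freqs = np.fft.fftfreq(size, d=pixel_size)  # spatial frequency, 1/Angstrom
    kx, ky = np.meshgrid(freqs, freqs, indexing="ij")
    k2 = kx**2 + ky**2
    lam = electron_wavelength_angstrom(voltage_kv)
    cs = cs_mm * 1e7  # mm -> Angstrom
    # Phase shift from defocus and spherical aberration.
    chi = np.pi * lam * defocus * k2 - 0.5 * np.pi * cs * lam**3 * k2**2
    return -(np.sqrt(1.0 - amp_contrast**2) * np.sin(chi) + amp_contrast * np.cos(chi))


def apply_ctf(image, ctf):
    """Filter a square real-space image with a CTF of the same shape."""
    return np.real(np.fft.ifft2(np.fft.fft2(image) * ctf))


# Two particles that differ only in defocus (values in Angstrom).
projection = np.zeros((64, 64))
projection[28:36, 28:36] = 1.0
near_focus = apply_ctf(projection, ctf_2d(64, pixel_size=1.0, defocus=5000.0))
far_focus = apply_ctf(projection, ctf_2d(64, pixel_size=1.0, defocus=25000.0))
```

Changing only `defocus` changes where the CTF oscillates and crosses zero, so the same projection yields visibly different particle images. This is why defocus is a natural per-particle condition.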

Improving Particle Picking

Qualitative comparison of particle picking results. Blue circles mark picks that match the manual picking results, while red circles mark particles the model missed or picks it made in excess.
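The match, miss, and excess categories above can be computed by pairing predicted picks with manual picks. The sketch below assumes a greedy nearest-pair matching within a fixed pixel radius. The page does not state how matches are determined, so the function name `match_picks`, the greedy rule, and the radius are hypothetical.

```python
import numpy as np


def match_picks(predicted, manual, radius):
    """Greedily pair predicted and manual picks that lie within `radius`.

    predicted, manual: sequences of (x, y) coordinates in pixels.
    Returns two boolean arrays: which predictions matched a manual pick,
    and which manual picks were found. Each pick is used at most once.
    """
    predicted = np.asarray(predicted, dtype=float).reshape(-1, 2)
    manual = np.asarray(manual, dtype=float).reshape(-1, 2)
    # Pairwise Euclidean distances, shape (num_predicted, num_manual).
    dists = np.linalg.norm(predicted[:, None, :] - manual[None, :, :], axis=2)
    pred_matched = np.zeros(len(predicted), dtype=bool)
    manual_matched = np.zeros(len(manual), dtype=bool)
    # Visit candidate pairs from closest to farthest.
    for flat_idx in np.argsort(dists, axis=None):
        i, j = np.unravel_index(flat_idx, dists.shape)
        if dists[i, j] > radius:
            break  # all remaining pairs are farther apart
        if not pred_matched[i] and not manual_matched[j]:
            pred_matched[i] = True
            manual_matched[j] = True
    return pred_matched, manual_matched


predicted = [(10, 10), (50, 50), (90, 90)]
manual = [(11, 10), (52, 49), (200, 200)]
pred_matched, manual_matched = match_picks(predicted, manual, radius=5.0)
precision = pred_matched.mean()   # fraction of picks that are matches
recall = manual_matched.mean()    # fraction of manual picks recovered
```

Under this convention, unmatched predictions are the excess picks and unmatched manual picks are the misses.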

BibTeX

@inproceedings{zhang2024cryogem,
        title={CryoGEM: Physics-Informed Generative Cryo-Electron Microscopy},
        author={Zhang, Jiakai and Chen, Qihe and Zeng, Yan and Gao, Wenyuan and He, Xuming and Liu, Zhijie and Yu, Jingyi},
        booktitle={Proceedings of the 38th International Conference on Neural Information Processing Systems},
        year={2024}
      }